Generalizing Association Rules to Ordinal Rules

Authors

  • Sylvie Guillaume
  • Ali Khenchaf
Abstract

Developing good measures of the interestingness of discovered rules is one of the important problems in data mining. Such measures are divided into objective measures, which depend only on the structure of a rule and the underlying data used in the discovery process, and subjective measures, which depend on the class of users who examine the rule. However, most objective measures are suited to binary attributes, so the usual unsupervised algorithms for discovering association rules require the initial set of attributes to be transformed into binary attributes. As a result, the complexity of these algorithms increases exponentially with the number of attributes, and this transformation can lead, on the one hand, to a combinatorial explosion and, on the other hand, to a prohibitive number of weakly significant rules with many redundancies. Moreover, the few measures suited to numeric attributes, such as the correlation coefficient, are not selective. In this paper, we propose a new objective measure, called ordinal intensity of implication, which generalizes the intensity of implication defined for binary attributes. It evaluates whether the number of transactions that clearly do not verify the rule X→Y (i.e., the number of transactions containing a high value for attribute X and a low value for attribute Y) is significantly small compared to a random draw. We finish the study with an evaluation on banking data and show some discovered ordinal rules and the connection between data/information and quality.
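
The abstract does not give the formula for the ordinal measure, so the following is only a minimal sketch of the binary intensity of implication that it generalizes. It assumes the commonly cited Poisson formulation, in which the number of counterexamples (transactions containing X but not Y) under a random draw follows a Poisson law with mean n_x(n - n_y)/n. The function name and parameters are hypothetical, and this is not the authors' ordinal version.

```python
import math


def intensity_of_implication(n, n_x, n_y, n_counter):
    """Binary intensity of implication for X -> Y (Poisson model, assumed).

    n         -- total number of transactions
    n_x       -- transactions containing X
    n_y       -- transactions containing Y
    n_counter -- observed counterexamples (X present, Y absent)

    Returns 1 - P(N <= n_counter), where N ~ Poisson(n_x * (n - n_y) / n).
    Values close to 1 mean the rule has surprisingly few counterexamples.
    """
    if n <= 0 or not (0 <= n_x <= n and 0 <= n_y <= n):
        raise ValueError("counts must satisfy 0 <= n_x, n_y <= n and n > 0")
    if not (0 <= n_counter <= min(n_x, n - n_y)):
        raise ValueError("n_counter cannot exceed n_x or n - n_y")

    # Expected number of counterexamples if X and Y were independent.
    lam = n_x * (n - n_y) / n

    # Accumulate the Poisson CDF term by term: P(N = k) = e^-lam * lam^k / k!
    term = math.exp(-lam)
    cdf = term
    for k in range(1, n_counter + 1):
        term *= lam / k
        cdf += term

    return 1.0 - cdf


if __name__ == "__main__":
    # Illustrative counts only: 100 transactions, 50 with X, 50 with Y.
    for observed in (5, 25, 45):
        print(observed, intensity_of_implication(100, 50, 50, observed))
```

With these illustrative counts the expected number of counterexamples is 25, so observing far fewer gives an intensity near 1 and observing far more gives an intensity near 0. The ordinal measure proposed in the paper replaces the binary counterexample count with a quantity built from high X values paired with low Y values, which this sketch does not attempt to reproduce.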

Similar resources

An Algorithm for the Discovery of Arbitrary Length Ordinal Association Rules

Association rule mining techniques are used to search for attribute-value pairs that frequently occur together in a data set. Ordinal association rules are a particular type of association rule that describes orderings between attributes that commonly occur over a data set [9]. Although ordinal association rules are defined between any number of attributes, only discovery algorithms of binary ordinal assoc...

Applying Ordinal Association Rules for Cleansing Data With Missing Values

Cleansing data of errors is an important processing step, particularly when integrating heterogeneous data sources. Dirty data files are prevalent in data warehouses because of incorrect or missing data values, inconsistent attribute naming conventions, or incomplete information. This paper improves the ordinal association rules technique for data cleansing by proposing a solution for the missing val...

Quantitative and Ordinal Association Rules Mining (QAR Mining)

Association rules have exhibited an excellent ability to identify interesting association relationships among a set of binary variables describing a huge number of transactions. Although the rules can be generalized relatively easily to other variable types, the generalization can result in a computationally expensive algorithm generating a prohibitive number of redundant rules of little signific...

Some Quality Measures for Fuzzy Association Rules

Several approaches generalizing crisp association rules to fuzzy association rules have been proposed. In a previous paper, we introduced a pair of confidence measures for crisp association rules from which most known quality measures can be obtained. In this paper, starting from these results, we give an extension to fuzzy association rules.

Towards Association Rules with Hidden Variables

The mining of association rules can provide relevant and novel information to the data analyst. However, current techniques do not take into account that the observed associations may arise from variables that are unrecorded in the database. For instance, the pattern of answers in a large marketing survey might be better explained by a few latent traits of the population than by direct associat...

Publication date: 2000